Democratizing Our Data by Julia Lane

Author: Julia Lane
Language: eng
Format: epub
Publisher: MIT Press


Setting Up Access

Protecting privacy is a paramount concern in data collection. But it is also clear that citizens are aware of the tradeoff between privacy and utility, particularly if the tradeoff is explained. In a recent report commissioned by the Data Funders Collaborative, the consulting firm Topos reported a private citizen saying, “I think in our current world we have to be concerned about data security. There are too many breaches not to be concerned. But I can absolutely see how social services having access to a child’s school and health records could keep a child safer and maybe save a life.”18 This quote encapsulates the lessons learned in the Allegheny County example, where the county worked with its constituents to make better use of data to allocate resources and better protect children.

Allowing access to confidential data (the data on individuals and businesses that most statistical agencies need in order to generate their required measures) is hard, because it is necessary to ensure that individuals and organizations cannot be reidentified. This is quite a challenge: simple information like zip code, date of birth, and sex can be sufficient to reidentify nine out of ten Americans.19 Even a decade ago, individual browsing habits, without any other identifiers, were used to find out an AOL user’s identity.20 And in 2012, the New York Times Magazine reported that Target needed no more information than a sixteen-year-old’s shopping habits to find out that she was pregnant before her father even knew.21
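To make the reidentification risk concrete, the following is a minimal sketch in Python that measures what share of records in a dataset are unique on a chosen set of fields. The function name `unique_fraction`, the field names, and the toy records are all hypothetical and invented for illustration; they are not drawn from the studies cited above, and a real disclosure-risk review would involve much more than this count.

```python
from collections import Counter


def unique_fraction(records, quasi_identifiers):
    """Return the share of records whose combination of
    quasi-identifier values appears exactly once in the data."""
    if not records:
        return 0.0
    # Build one key per record from the chosen fields.
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    # A record is "unique" if no other record shares its key.
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(records)


# Toy, made-up records (assumption: not real data).
records = [
    {"zip": "15213", "dob": "1980-01-02", "sex": "F"},
    {"zip": "15213", "dob": "1980-01-02", "sex": "F"},
    {"zip": "15213", "dob": "1975-06-30", "sex": "M"},
    {"zip": "15217", "dob": "1990-11-11", "sex": "F"},
]

share_all = unique_fraction(records, ["zip", "dob", "sex"])
share_sex = unique_fraction(records, ["sex"])
print(share_all, share_sex)
```

In this toy data, half of the records are unique on the three fields combined, while only one in four is unique on sex alone, which illustrates why combinations of seemingly harmless fields are what make reidentification possible.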

An important historical approach to keeping data safe has been to allow access only onsite at federal agencies or designated secure centers.22 The idea was that an authorized agency representative could then oversee researchers as they physically sat in a secure environment. Although the approach has succeeded in expanding scientists’ access to federal data, it tends to favor larger, wealthier universities and researchers who are able to travel to a physical center to do their work. The access is largely to federal agency data, not to the state or local data that are often essential for decision-making, or to the increasingly relevant new types of data (retail scanner, sensor, cell phone, or social media). Such limited access has constrained innovation. The constraint can be eased, however, by using new technology to widen access while protecting privacy, following what is referred to as the “five safes” framework: safe projects, safe people, safe settings, safe data, and safe outputs.23

Technology can help address the first “safe”: “safe projects.” A safe project creates value for the agency and the people it serves and is consistent with current legal, policy, ethical, and other relevant restrictions. In other words, the starting point is to ensure that a project has utility. Without that utility, no data access occurs. Once a safe project has been identified and agreed to, three things follow: the people authorized to access the data for the project’s purposes are named, the datasets are defined, and the protocols for protecting the data are agreed to. These rules can be encoded and tracked at a project level for the next “safe”: safe people.
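One way to picture rules being “encoded and tracked at a project level” is the minimal Python sketch below. It is an illustration under stated assumptions, not a description of any agency’s actual system: the `Project` class, its fields, and the example names are all hypothetical. Access is granted only when the project is approved, the person is authorized for it, and the dataset is one the project defined; every request is logged so that it can be tracked.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Hypothetical record of the rules agreed to for one project."""
    name: str
    approved: bool  # has the project been accepted as a "safe project"?
    authorized_people: set = field(default_factory=set)
    datasets: set = field(default_factory=set)
    access_log: list = field(default_factory=list)

    def can_access(self, person, dataset):
        """Check a request against the project's rules and log it."""
        allowed = (
            self.approved
            and person in self.authorized_people
            and dataset in self.datasets
        )
        # Every request is recorded, whether or not it is allowed.
        self.access_log.append((person, dataset, allowed))
        return allowed


# Made-up example values (assumption: purely illustrative).
project = Project(
    name="child-welfare-study",
    approved=True,
    authorized_people={"analyst_a"},
    datasets={"school_records"},
)

ok = project.can_access("analyst_a", "school_records")
denied_person = project.can_access("analyst_b", "school_records")
denied_dataset = project.can_access("analyst_a", "health_records")
print(ok, denied_person, denied_dataset, len(project.access_log))
```

The design choice here mirrors the ordering in the text: nothing is accessible unless the project itself has been approved, and the lists of people and datasets are attached to the project rather than managed separately.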


